ABSTRACT: The limited visibility experienced by crane operators in construction sites poses significant 
challenges, leading to reduced performance and safety concerns. Obstructive elements, such as existing buildings, 
construction elements, or vehicles, can block the crane operator's field of view, hindering their ability to execute 
lifting operations with precision and confidence. To address this issue, this study presents a novel approach using 
Building Information Modeling (BIM)-based diminished reality (DR) to enhance visibility by dynamically 
removing obstructive objects from the crane operator's perspective in real time. The research employs a 
marker-based registration system that effectively aligns BIM data with the physical environment, ensuring realistic and 
precise DR visualization. Additionally, a semi-automatic selection method that involves minimal intervention from 
the user is employed to select desired objects. To generate the background, the system utilizes real-time observation 
data from occluded areas. A validation through a case study demonstrates the practical applicability of the 
developed system in real-life construction scenarios. 

1. INTRODUCTION 


Construction sites are inherently hazardous environments due to the presence of heavy machinery and large 
equipment. Among these, cranes play a vital role in the construction process. Crane operators may be required to 
operate when they do not have direct visibility of the load, which is referred to as a "blind lift". This type of lift 
has been recognized by the industry as one of the most hazardous activities, as it poses a significant threat to both 
personnel and nearby property. In general, reduced visibility in the working area can lead to lower operator 
efficiency and have an adverse impact on both the end product's quality and overall productivity (Price et al., 2021). 


Diminished Reality (DR) has emerged as an effective solution for overcoming occlusions by recovering 
background scenes and giving an unobstructed view of the workspace. Meanwhile, Building Information Modeling 
(BIM), which is a digital representation of the building geometry and information (ISO, 2015), can be beneficial 
in the DR process. BIM can integrate data from various data-capture technologies, such as laser scanners, Global 
Positioning System (GPS), and imaging sensors, to provide complete data about a construction project 
(Alizadehsalehi & Yitmen, 2016). Considering these features, BIM data can be used to create a digital 
representation of the background scene that is required in the DR process. 


In this study, we investigate the implementation of BIM-based DR to enhance crane operator visibility. Our 
proposed approach aims to facilitate a safer and more efficient construction environment by providing crane 
operators with a clear and unobstructed view of their work area. By utilizing BIM data and DR technology, we 
seek to improve awareness, empower operators to make informed decisions, and elevate safety within the 
construction industry. The integration of BIM and DR holds the potential to significantly improve crane operations 
and enhance overall productivity and safety at construction sites. 


2. RELATED WORKS 
2.1 Occlusion handling in crane operations 


Various technologies have been developed to handle occlusion and to enhance visibility for crane operators. The 
most widespread approach used by the industry is to ensure that the crane operator remains in constant radio 
communication with either a rigger or a signal person, who can provide guidance throughout the lift. However, 
these methods of communication can be unreliable and cause various accidents (Mansoor et al., 2023). Many 
solutions have been developed to overcome this limitation and improve safety and efficiency at construction sites. 
For example, Price et al. (2021) present a crane monitoring system that can provide the crane operator with 
real-time 3D visualization and the ability to give and receive feedback during blind lift tasks. In this study, the 
safety warning system is also created based on a 3D model of the crane environment. This 3D model is developed 
in real time utilizing sensors, cameras, and laser scanners. An alternative approach involves the visualization of 
information using transparent displays (Sitompul et al., 2020). This information, which includes important details 
such as the height and weight of the lift, is displayed through head-down displays, which are installed near the 
operator's line of sight in the cabin. However, research has indicated that operators often pay minimal attention to 
the information presented on head-down displays (Wallmyr, 2017). This is primarily due to the placement of these 
displays far from the operator's line of sight, as they are positioned in a way that avoids hindering the operators' 
view. A major drawback of using these techniques is that a user is unable to view the information from their own 
point of view. 


2.2 Augmented reality in crane operations 


Augmented Reality (AR) can be used to combine computer-generated information with the user's view of the 
environment. AR systems may give the operator real-time feedback by superimposing valuable information such 
as the load weight, distance to the target, and other crucial data on their field of view, making AR a valuable tool 
for improving the crane operator's view of the surrounding area and the operation to be carried out (Sitompul 
& Wallmyr, 2019). For example, Yang et al. (2015) developed an AR system to assist operators by providing 
visual information, such as arrows. The findings indicated that the implementation of AR support led to a 
significant reduction in task completion time, as it allowed operators to perceive the environment more clearly and 
effectively. Moreover, it minimized collision frequency and enhanced the overall user experience, demonstrating 
the usefulness of AR in familiarizing operators with new environments (Yang et al., 2015). 


Nonetheless, despite the numerous benefits of AR techniques in crane operations, there are certain limitations that 
can be overcome by Mixed Reality (MR) techniques. One of the disadvantages of AR is that it may suffer from 
limited depth perception and occlusion issues. In AR systems, virtual objects are superimposed onto the user's 
view of the real world, but they may not always appear in the correct position relative to real-world objects, leading 
to misinterpretations and potential hazards (X. Li et al., 2018). 


H. Li et al. (2022) present a novel application of MR technology in the form of a night hoisting assistance system, 
highlighting the potential of MR for enhancing visibility and operational safety in crane operations. This system 
enables operators to perceive and interact with a virtual model of the hoisting process in real time. The system 
offers a variety of interaction modalities, including voice interaction, gesture recognition, and gaze tracking, 
allowing operators to intuitively manage and navigate the virtual environment. 


2.3 Diminished Reality 


Diminished Reality (DR), which is an advanced visualization technology for removing or reducing the visibility 
of objects in real-time, can go a step further by visually removing obstructive objects such as buildings, trees, or 
other equipment that may obstruct the operator's view of the workspace (Mori et al., 2017). Thus, DR can provide 
new opportunities for more accurate visualization for operators of heavy machinery such as cranes. Aromaa et al. 
(2020) introduced the concept of DR for generating see-through visualization, allowing the operator to perceive the 
machine's physical structure as transparent from their viewpoint (see Fig. 1 (a)). Instead of making the machine's 
cabin transparent, Palonen et al. (2017) developed an alternative method for visualizing the view in front of the 
machine using point clouds (see Fig. 1 (b)). 

Fig. 1. (a) See-through visualization of the boom presented in (Aromaa et al., 2020), (b) Visualization of the 
environment using point cloud presented in (Palonen et al., 2017) 


Implementing effective DR solutions for operator visibility enhancement comes with a set of challenges that 
researchers and developers need to address. The main challenge of DR is obtaining reliable and accurate 
information about the hidden background, especially in dynamic construction sites where the surroundings may 
change frequently. Since DR aims to remove or reduce the visibility of obstructive objects, it requires access to 
real-time observation data or an accurate representation of the background scene to create a seamless visualization. 
Another challenge is the precise alignment of the virtual model with the real-world environment. For effective DR 
visualization, the virtual model must be accurately registered with the physical scene to ensure a seamless blend 
between the two. Achieving precise alignment often requires robust marker-based registration methods or other 
sophisticated tracking techniques, which can be complex to implement and may require specialized hardware or 
software. Furthermore, in scenarios where the background scene is dynamic and constantly changing, maintaining 
real-time updates of the hidden background information becomes critical. The DR system must continuously 
receive and process the latest observation data to accurately reflect any changes in the environment. This real-time 
processing can place significant computational demands on the system, requiring efficient algorithms and powerful 
hardware to handle the data in a timely manner. Overcoming these challenges and creating a seamless DR 
experience for crane operators requires sophisticated data processing techniques and a good understanding of the 
specific requirements of the construction site environment. 


3. PROPOSED DR SYSTEM 


The proposed system for enhancing crane operator visibility using BIM-based diminished reality allows for the 
seamless alignment of physical and virtual scenes, enabling the visualization of occluding objects and their 
removal from the crane operator's view in an MR environment. This approach aims to enhance visibility, safety, 
and situational awareness for crane operators in real-life construction scenarios. 


Using our proposed system, the crane operator, who controls the overhead crane from the shop floor and is 
equipped with a head-mounted display, interacts with the system using hand gestures to visually remove the 
sections where obstructive objects are present. The process begins with scanning the QR code markers, followed 
by alignment of the 3D virtual model onto the physical scene. Subsequently, specific objects within the virtual 
model can be selected. Afterward, the system seamlessly integrates the real-time video feed from CCTV cameras, 
showing the dynamic real-time background and further enhancing the operator's field of vision. 


The system architecture consists of three main layers, as illustrated in Fig. 2. The first layer involves data collection. 
The BIM model provides additional contextual information, such as the physical layout of the construction site, 
the positions of obstructive objects, and the dimensions and characteristics of the crane. The laser scanning system in 
combination with the BIM model can create the initial static 3D environment map. Accurate placement of QR 
code markers in the model ensures precise registration and tracking. Subsequently, real-time data is collected from 
video streams captured by CCTV cameras placed strategically in the environment. The second layer, data integration and 
processing, includes both the alignment and DR processing modules. In this layer, the aligned BIM model is 
integrated with the real-time observation data from the CCTV cameras. Through this integration, the dynamic 
updating of the DR visualization is achieved, ensuring a seamless and accurate representation of the background 
scene. The last layer involves the visualization of the enhanced scene in an MR environment. In our implementation, 
a Microsoft HoloLens 2 headset was utilized to present the MR visualization to the crane operators, allowing them 
to perceive the virtual and physical elements seamlessly. The visualization module provided the crane operator 
with an enhanced and contextually accurate representation of the construction site. The headset's advanced hand 
and gesture recognition capabilities enabled precise and responsive interaction with the MR environment. Crane 
operators could easily manipulate and navigate the virtual content using natural hand gestures, allowing for 
efficient and fluid control over the DR visualization. 

Fig. 2. System architecture for enhancing crane operator visibility 


The system is implemented using C# programming in the Unity 3D environment. Unity 3D facilitates integration 
of video streams into a virtual environment. The OpenCV framework is employed to execute real-time image 
processing algorithms on the frames, enabling efficient and responsive operations. Additionally, a Wi-Fi 
connection is established between the HMD (Head Mounted Display) and the CCTV cameras. This wireless 
network connection allows seamless video streaming to the HMD, ensuring real-time visualization of the 
environment. 


Fig. 3 shows the process flow of the generated prototype system, which will be elaborated in the following 
subsections. 

Fig. 3. DR process flow 


3.1 Virtual Model Generation 


First, the virtual model of the environment is generated using the BIM model in combination with reality capture 
techniques. The 3D scanner plays a vital role in reality capture by providing highly accurate and precise point 
clouds of the environment. It captures intricate details and geometries, ensuring that the virtual model is an accurate 
representation of the real-world site. Compared to the Structure from Motion (SfM) method used by Inoue et 
al. (2018), which heavily relies on the quality and number of images taken to reconstruct the background, the 3D 
scanner captures data directly from the environment, minimizing the dependency on photo quality and providing 
a more robust solution. Furthermore, accurate scene reconstruction can be highly challenging when applying the 
SfM method to complex construction environments. The 3D scanner, with its high precision, can handle such 
complex environments more effectively, leading to a more reliable and detailed background generation. Then, 
Autodesk 3ds Max is used to create a high-quality 3D model of the scene using the BIM model and point cloud 
data. This 3D virtual model can then be optimized and converted into a low-polygon model suitable for real-time 
rendering in game engine environments. In addition, the location of the QR code is defined in the virtual model in 
this step for the marker-based registration. The physical QR code marker is subsequently placed in the correct 
location within the environment. 


3.2 Virtual Model Alignment 


As indicated in the previous subsection, to adequately align the virtual model in the physical world, QR code 
markers are used. A Microsoft HoloLens 2 headset tracks the camera's position, ensuring accurate alignment of 
the virtual and physical scenes. Vuforia image target technology (Vuforia Enterprise Augmented Reality (AR) 
Software | PTC) plays a crucial role in achieving precise alignment between the virtual and physical scenes in this 
study. As the user scans the QR code marker using a headset equipped with the Vuforia engine, the system identifies 
the unique image target and establishes a reference point. By recognizing the image target (the QR code marker 
stored in the Vuforia database), Microsoft HoloLens gains an understanding of its position and orientation in the 
real-world environment. 
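
The registration step can be illustrated as a composition of rigid transforms. The following is a minimal sketch and not the authors' implementation: it assumes the marker pose in model coordinates is known from the virtual model and that the tracker reports the marker pose in world coordinates. All function names and numeric poses are hypothetical, and no Vuforia or HoloLens API is used.

```python
import math

def rot_z(deg):
    """3x3 rotation about the vertical (z) axis, angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def make_pose(rotation, translation):
    """Build a 4x4 homogeneous rigid transform from R (3x3) and t (3)."""
    rows = [rotation[i] + [translation[i]] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(t):
    """Inverse of a rigid transform: rotation R^T, translation -R^T t."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(r_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return make_pose(r_t, trans)

def transform_point(t, p):
    """Apply a 4x4 rigid transform to a 3D point."""
    x, y, z = p
    return tuple(t[i][0] * x + t[i][1] * y + t[i][2] * z + t[i][3]
                 for i in range(3))

def register_model(world_T_marker, model_T_marker):
    """world_T_model = world_T_marker * inverse(model_T_marker)."""
    return matmul(world_T_marker, invert_rigid(model_T_marker))

# Hypothetical poses: where the marker sits in the virtual model, and
# where the tracker observes the same marker in the physical space.
model_T_marker = make_pose(rot_z(0.0), [2.0, 1.0, 0.0])
world_T_marker = make_pose(rot_z(90.0), [5.0, 3.0, 0.0])
world_T_model = register_model(world_T_marker, model_T_marker)
```

Applying `world_T_model` to every vertex of the virtual model places it in the physical frame, so the marker location in the model coincides with the observed marker.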


3.3 Object Selection 


Users can interact with the virtual model by selecting objects they wish to remove from their view. Upon selection, 
information about the object, including its metadata transferred from the IFC model, is displayed in the user's view. 
This interactive process allows for a more user-friendly and intuitive experience. 
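
One possible way to realize such a selection is sketched below. It is an assumption for illustration only: the paper does not specify the selection logic, so this sketch assumes each model object has a screen-space bounding box in the operator's view and that the user supplies a rectangular region. The class, field names, and object identifiers are hypothetical placeholders, not real IFC GlobalIds.

```python
from dataclasses import dataclass

@dataclass
class ModelObject:
    global_id: str      # identifier carried over from the IFC model
    ifc_class: str      # e.g. "IfcWall"
    name: str
    screen_box: tuple   # (x_min, y_min, x_max, y_max) in view pixels
    hidden: bool = False

def boxes_overlap(a, b):
    """True if two axis-aligned boxes (x_min, y_min, x_max, y_max) intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def select_objects(objects, roi):
    """Mark objects overlapping the region as hidden; return their metadata."""
    selected = [o for o in objects if boxes_overlap(o.screen_box, roi)]
    for o in selected:
        o.hidden = True
    return [{"GlobalId": o.global_id, "IfcClass": o.ifc_class, "Name": o.name}
            for o in selected]

# Hypothetical scene content and a user-drawn region of interest.
objects = [
    ModelObject("wall-01", "IfcWall", "Partition wall", (200, 100, 420, 380)),
    ModelObject("col-07", "IfcColumn", "Steel column", (600, 50, 640, 400)),
]
roi = (180, 90, 450, 400)
metadata = select_objects(objects, roi)
```

Here the user only draws the region, and the lookup of matching objects and their metadata is automatic, which is one reading of a selection method with minimal user intervention.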


3.4 Object Removal 


The process of object removal involves several steps in an MR environment. First, the system captures real-time 
video streams from CCTV cameras, which provide a view of the target environment, including obstructive objects. 
Using the interactive HoloLens interface, the operator can select a region of interest, which includes obstructive 
objects like walls. The frames captured by the CCTV camera are transmitted in real-time, accompanied by 
annotation information, including the camera's pose at the time of each frame. After any distortions are corrected, 
these frames are decoded and uploaded as textures to the GPU (Graphics Processing Unit) of the headset device, 
enabling the generation of a DR view. The image warping process is then initiated, identifying corresponding 
points between the selected region in the operator's view and the frames coming from the real-time video stream. 
By calculating a transformation matrix based on these points, the system precisely aligns the background view 
with the real-world environment from the crane operator's perspective. As a result, the selected obstructive objects 
are visually replaced with the corresponding background from the virtual model. The HoloLens application renders 
this augmented view, providing the crane operator with an unobstructed and clear representation of the 
environment. The entire process happens in real time, updating when the crane operator moves or changes their 
perspective, resulting in better awareness of the situation and informed decision-making during complex lifting 
operations. 
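
The transformation-matrix step can be sketched as follows. The paper does not state which transformation model is used, so this sketch assumes a planar homography estimated from exactly four point correspondences, which is comparable in spirit to what OpenCV's `getPerspectiveTransform` computes. The function names and sample coordinates are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences (e.g. collinear points)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def homography_from_points(src, dst):
    """3x3 homography mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def warp_point(h, point):
    """Map a 2D point through the homography (with perspective division)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

# Hypothetical correspondences: normalized corners of the CCTV frame region
# and the matching corners of the selected region in the operator's view.
frame_corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
view_corners = [(100.0, 80.0), (520.0, 60.0), (560.0, 400.0), (90.0, 420.0)]
view_T_frame = homography_from_points(frame_corners, view_corners)
```

With more than four correspondences, a least-squares or robust estimator would be the usual choice. The four-point case is shown here only because it keeps the linear system square.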


3.5 Visualization 


The final MR visualization, presented through the headset, seamlessly combines real-world observation data from 
CCTV cameras with the DR-processed view. Obstructive objects, previously removed using image warping, are 
no longer present in the operator's field of view, ensuring an unobstructed and clear perspective. This MR 
visualization empowers the crane operator with real-time and accurate information. 
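
The final compositing step can be sketched as a masked blend. This is a simplified illustration, not the rendering path of the HoloLens application: it assumes the operator's view, the warped background, and the selected region are all available as pixel arrays of equal size. The function name and the `alpha` parameter are hypothetical.

```python
def composite(view, background, mask, alpha=1.0):
    """Blend background into view wherever mask is set.

    view, background: rows of (r, g, b) tuples with identical dimensions.
    mask: rows of 0/1 flags marking the selected obstructive region.
    alpha: 1.0 fully replaces the obstruction; lower values leave it
           partially visible as a semi-transparent overlay.
    """
    out = []
    for view_row, bg_row, mask_row in zip(view, background, mask):
        row = []
        for v, b, m in zip(view_row, bg_row, mask_row):
            if m:
                row.append(tuple(round(alpha * bc + (1 - alpha) * vc)
                                 for vc, bc in zip(v, b)))
            else:
                row.append(v)
        out.append(row)
    return out

# Tiny hypothetical 1x2 image: only the second pixel is inside the mask.
view = [[(10, 10, 10), (20, 20, 20)]]
background = [[(200, 200, 200), (100, 100, 100)]]
mask = [[0, 1]]
removed = composite(view, background, mask)
ghosted = composite(view, background, mask, alpha=0.5)
```

Keeping a partially visible outline of the removed object (an `alpha` below 1.0) is a design option that may help operators stay aware of physical obstacles. Whether the described system does this is not stated.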


4. CASE STUDY 


In this case study, we conducted initial steps for the validation of our developed system in a real-world setting at 
a prefabrication factory's shop floor located in Montreal, Canada. The manufacturing of prefabricated modules is 
done on the factory’s production floor, with distinct zones and the presence of cranes for material handling (see 
Fig. 4 (a)). The type of crane used in this study is an overhead crane, which is defined by its ability to move along 
rails that are located overhead, thereby offering flexibility in the lifting and handling of material. The system's 
capabilities in improving operator visibility and safety during crane operations are the focus of this case study. The 
crane operator is equipped with a remote controller to manipulate the crane's movements and operations. The 
modules are placed in close proximity on the shop floor due to a lack of space. The operator's viewpoint is 
obstructed by this setup, which reduces their ability to see crucial parts such as the hook of the crane. In these 
situations, the operator requires the presence of additional workers near the hook to help manage the entire 
operation. The integration of the overhead crane with the proposed BIM-based DR system provides a solution to 
overcome the challenges of limited visibility faced by crane operators. As shown in Fig. 4 (c), the CCTV cameras 
were strategically placed around the module. Fig. 4 (b) shows the 3D virtual model of the prefabricated module. 


Fig. 4. Experimental area; (a) Physical factory environment, (b) Virtual model of one module, and (c) Factory 
and camera settings in Unity environment. 


Point clouds of the environment are collected by Leica Cyclone REGISTER 360, as illustrated in Fig. 5 (a), (b). 
The point cloud in combination with the BIM model helps us to generate a low-polygon virtual model of the scene 
(shown in Fig. 4 (b)). The process began by placing a QR code marker at the same location in the physical scene 
as in the virtual model. When the crane operator wore the HoloLens and scanned the QR code marker, the 
HoloLens accurately tracked the camera's position and orientation, ensuring precise alignment between the virtual 
and physical scenes. 


Fig. 5. Point clouds of the environment collected by Leica Cyclone REGISTER 360; (a) Scanning stations; (b) 
Point cloud data of the target module. 


Fig. 6 (b) illustrates the final result of the DR process within the HoloLens 2 environment through a screenshot of 
the user interface. In the screenshot, the actual prefabrication shop floor is displayed, and obstructive objects are 
highlighted as regions of interest. Crane operators can use the HoloLens 2's gesture recognition capabilities to 
select specific obstructive elements by drawing regions of interest around them using natural hand movements. 
Once the regions of interest are selected (red dashed line in Fig. 6 (b)), the DR visualization algorithm processes the 
data in real-time to remove the obstructive objects from the operator's view. 


Fig. 6 (a) shows the operator's view prior to the application of the DR process, providing a reference point for 
the visual change effected by DR, as shown in Fig. 6 (b). 

SECTION A - EXTENDED REALITY TECHNOLOGIES IN CONSTRUCTION 

Fig. 6. HoloLens view: (a) before the DR process; (b) after the DR process. 


5. DISCUSSION 


The application of BIM-based DR for enhancing crane operator visibility presents a promising solution for 
addressing challenges related to obstructed views in construction sites. The proposed system effectively combines 
advanced visualization technologies and real-time data integration to provide crane operators with a clearer, 
unobstructed view of their operational scene. 

The integration of BIM data and real-time observation from CCTV cameras improves the DR visualization's 
accuracy. This integration tackles the issue of reliable background information by providing a continually updated 
view of the surroundings via real-time video streams. This ensures that crane operators are provided with a realistic 
and up-to-date representation of the construction area. 


Despite the demonstrated effectiveness of the developed system, some limitations and challenges have been 
identified. The accuracy and reliability of the DR visualization heavily rely on the quality and availability of 
real-time observation data. In addition, factors such as changing lighting conditions and a dynamic physical environment 
can influence the accuracy of tracking and registration. Future research can explore potential solutions, such as 
leveraging advanced imaging technologies or integrating cutting-edge technologies, including sensors and cloud 
solutions. For example, sensors such as LiDAR (Light Detection and Ranging) can be used to create detailed 3D 
maps of the environment to help operators in navigating complex environments, detecting obstacles, and 
improving situational awareness. Position sensors, such as GPS (Global Positioning System), can precisely track 
the crane's location. Cloud solutions can also be used for data storage and accessibility, providing a centralized and 
secure repository for storing large amounts of data such as photos, videos, and sensor readings. In addition, 
cloud-based analytics tools can process sensor data in real-time, providing valuable insights to crane operators. By 
overcoming these challenges, we can further enhance the precision and reliability of the system, opening up new 
possibilities for improved crane operator visibility and safety at construction sites. 


6. CONCLUSION 


This research investigated a BIM-based DR approach to enhance crane operator visibility and safety at 
construction sites. By dynamically removing obstructive objects in real-time, the proposed system offers crane 
operators an unobstructed view of the construction scene, significantly improving their visibility and 
decision-making. The seamless integration of BIM data and real-time observation data enables a realistic and accurate DR 
visualization. While our developed system shows promising results, further investigation is needed to address 
limitations such as the quality of real-time observation data and challenges related to registration and tracking. 